AITopics | second-order approximation

Collaborating Authors

second-order approximation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

EvaluatingRobustnesstoDatasetShift viaParametricRobustnessSets

Neural Information Processing SystemsFeb-9-2026, 15:06:24 GMT

These shifts are defined via parametric changes in the causal mechanisms of observed variables, where constraints on parameters yield a "robustness set" of plausible distributions and acorresponding worst-case loss overthe set.

approximation, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Bridging Discrete and Backpropagation: Straight-Through and Beyond Liyuan Liu Chengyu Dong Xiaodong Liu Bin Y u Jianfeng Gao Microsoft Research

Neural Information Processing SystemsFeb-9-2026, 06:33:46 GMT

Backpropagation, the cornerstone of deep learning, is limited to computing gradients for continuous variables. This limitation poses challenges for problems involving discrete latent variables. To address this issue, we propose a novel approach to approximate the gradient of parameters involved in generating discrete latent variables. First, we examine the widely used Straight-Through (ST) heuristic and demonstrate that it works as a first-order approximation of the gradient. Guided by our findings, we propose ReinMax, which achieves second-order accuracy by integrating Heun's method, a second-order numerical method for solving ODEs. ReinMax does not require Hessian or other second-order derivatives, thus having negligible computation overheads. Extensive experimental results on various tasks demonstrate the superiority of ReinMax over the state of the art.

approximation, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.61)

Add feedback

Evaluating Robustness to Dataset Shift via Parametric Robustness Sets

Neural Information Processing SystemsDec-24-2025, 09:56:50 GMT

We give a method for proactively identifying small, plausible shifts in distribution which lead to large differences in model performance. These shifts are defined via parametric changes in the causal mechanisms of observed variables, where constraints on parameters yield a robustness set of plausible distributions and a corresponding worst-case loss over the set. While the loss under an individual parametric shift can be estimated via reweighting techniques such as importance sampling, the resulting worst-case optimization problem is non-convex, and the estimate may suffer from large variance. For small shifts, however, we can construct a local second-order approximation to the loss under shift and cast the problem of finding a worst-case shift as a particular non-convex quadratic optimization problem, for which efficient algorithms are available. We demonstrate that this second-order approximation can be estimated directly for shifts in conditional exponential family models, and we bound the approximation error. We apply our approach to a computer vision task (classifying gender from images), revealing sensitivity to shifts in non-causal attributes.

dataset shift, name change, parametric robustness set, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.98)

Add feedback

Outlier Detection of Poisson-Distributed Targets Using a Seabed Sensor Network

Kim, Mingyu, Stilwell, Daniel, Jimenez, Jorge

arXiv.org Artificial IntelligenceAug-19-2025

This paper presents a framework for classifying and detecting spatial commission outliers in maritime environments using seabed acoustic sensor networks and log Gaussian Cox processes (LGCPs). By modeling target arrivals as a mixture of normal and outlier processes, we estimate the probability that a newly observed event is an outlier. We propose a second-order approximation of this probability that incorporates both the mean and variance of the normal intensity function, providing improved classification accuracy compared to mean-only approaches. We analytically show that our method yields a tighter bound to the true probability using Jensen's inequality. To enhance detection, we integrate a real-time, near-optimal sensor placement strategy that dynamically adjusts sensor locations based on the evolving outlier intensity. The proposed framework is validated using real ship traffic data near Norfolk, Virginia, where numerical results demonstrate the effectiveness of our approach in improving both classification performance and outlier detection through sensor deployment.

approximation, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.13099

Country: North America > United States > Virginia > Norfolk City County > Norfolk (0.25)

Genre: Research Report > New Finding (0.66)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks > Sensor Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.61)

Add feedback

Evaluating Robustness to Dataset Shift via Parametric Robustness Sets

Neural Information Processing SystemsAug-15-2025, 15:18:10 GMT

accuracy, approximation, international conference, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

Evaluating Robustness to Dataset Shift via Parametric Robustness Sets

Neural Information Processing SystemsOct-11-2024, 11:50:58 GMT

We give a method for proactively identifying small, plausible shifts in distribution which lead to large differences in model performance. These shifts are defined via parametric changes in the causal mechanisms of observed variables, where constraints on parameters yield a "robustness set" of plausible distributions and a corresponding worst-case loss over the set. While the loss under an individual parametric shift can be estimated via reweighting techniques such as importance sampling, the resulting worst-case optimization problem is non-convex, and the estimate may suffer from large variance. For small shifts, however, we can construct a local second-order approximation to the loss under shift and cast the problem of finding a worst-case shift as a particular non-convex quadratic optimization problem, for which efficient algorithms are available. We demonstrate that this second-order approximation can be estimated directly for shifts in conditional exponential family models, and we bound the approximation error.

dataset shift, parametric robustness set, second-order approximation, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.43)

Add feedback

Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion

Rout, Litu, Chen, Yujia, Kumar, Abhishek, Caramanis, Constantine, Shakkottai, Sanjay, Chu, Wen-Sheng

arXiv.org Machine LearningDec-1-2023

Sampling from the posterior distribution poses a major computational challenge in solving inverse problems using latent diffusion models. Common methods rely on Tweedie's first-order moments, which are known to induce a quality-limiting bias. Existing second-order approximations are impractical due to prohibitive computational costs, making standard reverse diffusion processes intractable for posterior sampling. This paper introduces Second-order Tweedie sampler from Surrogate Loss (STSL), a novel sampler that offers efficiency comparable to first-order Tweedie with a tractable reverse process using second-order approximation. Our theoretical results reveal that the second-order approximation is lower bounded by our surrogate loss that only requires $O(1)$ compute using the trace of the Hessian, and by the lower bound we derive a new drift term to make the reverse process tractable. Our method surpasses SoTA solvers PSLD and P2L, achieving 4X and 8X reduction in neural function evaluations, respectively, while notably enhancing sampling quality on FFHQ, ImageNet, and COCO benchmarks. In addition, we show STSL extends to text-guided image editing and addresses residual distortions present from corrupted images in leading text-guided image editing methods. To our best knowledge, this is the first work to offer an efficient second-order approximation in solving inverse problems using latent diffusion and editing real-world images with corruptions.

artificial intelligence, editing, machine learning, (18 more...)

arXiv.org Machine Learning

2312.00852

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > China > Yunnan Province > Kunming (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Understanding the Effects of Second-Order Approximations in Natural Policy Gradient Reinforcement Learning

Gebotys, Brennan, Wong, Alexander, Clausi, David A.

arXiv.org Artificial IntelligenceOct-10-2022

Natural policy gradient methods are popular reinforcement learning methods that improve the stability of policy gradient methods by utilizing second-order approximations to precondition the gradient with the inverse of the Fisher-information matrix. However, to the best of the authors' knowledge, there has not been a study that has investigated the effects of different second-order approximations in a comprehensive and systematic manner. To address this, five different second-order approximations were studied and compared across multiple key metrics including performance, stability, sample efficiency, and computation time. Furthermore, hyperparameters which aren't typically acknowledged in the literature are studied including the effect of different batch sizes and optimizing the critic network with the natural gradient. Experimental results show that on average, improved second-order approximations achieve the best performance and that using properly tuned hyperparameters can lead to large improvements in performance and sample efficiency ranging up to +181%. We also make the code in this study available at https://github.com/gebob19/natural-policy-gradient-reinforcement-learning.

approximation, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2201.09104

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

First-order and second-order variants of the gradient descent: a unified framework

Pierrot, Thomas, Perrin, Nicolas, Sigaud, Olivier

arXiv.org Machine LearningOct-18-2018

In this paper, we provide an overview of first-order and second-order variants of the gradient descent methods commonly used in machine learning. We propose a general framework in which 6 of these methods can be interpreted as different instances of the same approach. These methods are the vanilla gradient descent, the classical and generalized Gauss-Newton methods, the natural gradient descent method, the gradient covariance matrix approach, and Newton's method. Besides interpreting these methods within a single framework, we explain their specificities and show under which conditions some of them coincide. Machine learning generally amounts to solving an optimization problem where a loss function has to be minimized.

artificial intelligence, gradient descent, machine learning, (15 more...)

arXiv.org Machine Learning

1810.08102

Country:

North America > United States > New York > Nassau County > Mineola (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback